Welcome to the Notebook

Importing modules

Task 1

In [26]:
import pandas as pd
import numpy as np
import plotly.express as px
import matplotlib.pyplot as plt 
print('modules are imported')
modules are imported

Task 1.1:

Loading the Dataset

In [27]:
dataset_url = 'https://raw.githubusercontent.com/datasets/covid-19/master/data/countries-aggregated.csv'
df = pd.read_csv(dataset_url)

Task 1.2:

let's check the dataframe

In [32]:
df.head()
Out[32]:
Date Country Confirmed Recovered Deaths
0 2020-01-22 Afghanistan 0 0 0
1 2020-01-23 Afghanistan 0 0 0
2 2020-01-24 Afghanistan 0 0 0
3 2020-01-25 Afghanistan 0 0 0
4 2020-01-26 Afghanistan 0 0 0
In [29]:
df.tail()
Out[29]:
Date Country Confirmed Recovered Deaths
161563 2022-04-12 Zimbabwe 247094 0 5460
161564 2022-04-13 Zimbabwe 247160 0 5460
161565 2022-04-14 Zimbabwe 247208 0 5462
161566 2022-04-15 Zimbabwe 247237 0 5462
161567 2022-04-16 Zimbabwe 247237 0 5462

let's check the shape of the dataframe

In [5]:
df.shape
Out[5]:
(161568, 5)

Task 2.1 :

let's do some preprocessing

In [6]:
df = df[df.Confirmed > 0]
In [7]:
df.head()
Out[7]:
Date Country Confirmed Recovered Deaths
33 2020-02-24 Afghanistan 5 0 0
34 2020-02-25 Afghanistan 5 0 0
35 2020-02-26 Afghanistan 5 0 0
36 2020-02-27 Afghanistan 5 0 0
37 2020-02-28 Afghanistan 5 0 0
In [8]:
df[df.Country == 'Italy']
Out[8]:
Date Country Confirmed Recovered Deaths
70185 2020-01-31 Italy 2 0 0
70186 2020-02-01 Italy 2 0 0
70187 2020-02-02 Italy 2 0 0
70188 2020-02-03 Italy 2 0 0
70189 2020-02-04 Italy 2 0 0
... ... ... ... ... ...
70987 2022-04-12 Italy 15404809 0 161032
70988 2022-04-13 Italy 15467395 0 161187
70989 2022-04-14 Italy 15533012 0 161336
70990 2022-04-15 Italy 15595302 0 161469
70991 2022-04-16 Italy 15659835 0 161602

807 rows × 5 columns

let's see the global spread of COVID-19

In [9]:
fig = px.choropleth(df, locations= 'Country', locationmode = 'country names', color='Confirmed', animation_frame= 'Date')
fig.update_layout(title_text= 'Global Spread of COVID-19')
fig.show()

Task 2.2 : Exercise

Let's see the global deaths from COVID-19

In [10]:
fig = px.choropleth(df, locations= 'Country', locationmode = 'country names', color='Deaths', animation_frame= 'Date')
fig.update_layout(title_text= 'Global Deaths of COVID-19')
fig.show()

Task 3.1:

Let's visualize how intensive the COVID-19 transmission has been in each country

let's start with an example:

In [11]:
df_china = df[df.Country == 'China']
df_china.head()
Out[11]:
Date Country Confirmed Recovered Deaths
30192 2020-01-22 China 548 28 17
30193 2020-01-23 China 643 30 18
30194 2020-01-24 China 920 36 26
30195 2020-01-25 China 1406 39 42
30196 2020-01-26 China 2075 49 56

let's select the columns that we need

In [12]:
df_china = df_china[['Date','Confirmed']]
In [13]:
df_china.head()
Out[13]:
Date Confirmed
30192 2020-01-22 548
30193 2020-01-23 643
30194 2020-01-24 920
30195 2020-01-25 1406
30196 2020-01-26 2075

calculating the first derivative (day-over-day difference) of the Confirmed column

In [14]:
df_china['Infection Rate']= df_china['Confirmed'].diff()
In [15]:
df_china.head()
Out[15]:
Date Confirmed Infection Rate
30192 2020-01-22 548 NaN
30193 2020-01-23 643 95.0
30194 2020-01-24 920 277.0
30195 2020-01-25 1406 486.0
30196 2020-01-26 2075 669.0
In [16]:
px.line(df_china , x = 'Date', y = ['Confirmed', 'Infection Rate'])
In [17]:
df_china['Infection Rate'].max()
Out[17]:
77402.0
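
The 'Infection Rate' above is the day-over-day change in the cumulative Confirmed count. A minimal sketch on made-up numbers (the `toy` frame and its values are assumptions for illustration, not real data) shows what `diff()` returns and why the first row is NaN:

```python
import pandas as pd

# Hypothetical cumulative counts, for illustration only
toy = pd.DataFrame({
    'Date': ['2020-01-22', '2020-01-23', '2020-01-24', '2020-01-25'],
    'Confirmed': [10, 15, 27, 27],
})

# diff() subtracts the previous row from the current row;
# the first row has no predecessor, so its value is NaN
toy['Infection Rate'] = toy['Confirmed'].diff()

print(toy)
# max() skips NaN by default, so the missing first value is ignored
print(toy['Infection Rate'].max())
```

A day with no new reported cases shows up as 0.0, as in the last row of the toy frame.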

Task 3.2:

Let's calculate the maximum infection rate for all of the countries

In [18]:
df.head()
Out[18]:
Date Country Confirmed Recovered Deaths
33 2020-02-24 Afghanistan 5 0 0
34 2020-02-25 Afghanistan 5 0 0
35 2020-02-26 Afghanistan 5 0 0
36 2020-02-27 Afghanistan 5 0 0
37 2020-02-28 Afghanistan 5 0 0
In [24]:
countries = list(df['Country'].unique())
max_infection_rates = []
for c in countries:
    # largest day-over-day increase in confirmed cases for this country
    max_infection_rate = df[df.Country == c].Confirmed.diff().max()
    max_infection_rates.append(max_infection_rate)

print(max_infection_rates)
[3243.0, 4789.0, 2521.0, 2313.0, 5035.0, 0.0, 468.0, 139853.0, 4388.0, 175271.0, 58583.0, 7779.0, 1497.0, 8173.0, 16230.0, 1329.0, 8921.0, 133480.0, 1517.0, 2566.0, 2291.0, 23611.0, 5254.0, 41576.0, 287149.0, 7380.0, 12399.0, 1005.0, 7083.0, 4710.0, 1469.0, 1130.0, 9668.0, 63808.0, 4044.0, 596.0, 41651.0, 77402.0, 35575.0, 275.0, 1188.0, 4481.0, 18188.0, 2858.0, 11812.0, 9907.0, 6494.0, 57378.0, 55709.0, 99.0, 415.0, 392.0, 7439.0, 17670.0, 5516.0, 12677.0, 1750.0, 282.0, 8438.0, 1642.0, 5185.0, 1854.0, 28891.0, 503349.0, 1871.0, 587.0, 26320.0, 527487.0, 2521.0, 50182.0, 902.0, 5826.0, 534.0, 191.0, 1186.0, 737.0, 7.0, 12890.0, 45047.0, 7408.0, 414188.0, 64718.0, 50228.0, 13515.0, 43199.0, 243295.0, 228123.0, 1968.0, 104345.0, 25502.0, 66121.0, 3749.0, 350.0, 621317.0, 4397.0, 6913.0, 11505.0, 3915.0, 11992.0, 10760.0, 6925.0, 447.0, 5694.0, 349.0, 12968.0, 5497.0, 7.0, 2295.0, 1316.0, 33406.0, 2838.0, 1217.0, 1677.0, 3.0, 1211.0, 30006.0, 109895.0, 0.0, 6199.0, 520.0, 24556.0, 2960.0, 12039.0, 4947.0, 3268.0, 10052.0, 380498.0, 39814.0, 718.0, 301.0, 6158.0, 2332.0, 26109.0, 6146.0, 12073.0, 371.0, 19722.0, 1543.0, 25833.0, 99645.0, 38867.0, 57659.0, 75276.0, 4206.0, 40018.0, 202211.0, 3072.0, 238.0, 722.0, 2723.0, 701.0, 491.0, 319.0, 5928.0, 1722.0, 36737.0, 2068.0, 192.0, 39252.0, 28504.0, 23332.0, 681.0, 1066.0, 37875.0, 503.0, 372766.0, 11366.0, 1284.0, 112.0, 1404.0, 138985.0, 89462.0, 905.0, 1348.0, 407.0, 24307.0, 52284.0, 532.0, 1002.0, 832.0, 1259.0, 19923.0, 823225.0, 1383795.0, 21324.0, 45022.0, 4471.0, 848169.0, 13612.0, 1478.0, 566.0, 4418.0, 454212.0, 30356.0, 55.0, 287.0, 5555.0, 9185.0]
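
The loop above filters the dataframe once per country. As an alternative sketch (not the method used in this notebook), the same per-country maximum can be computed with `groupby`. The `toy` frame below is made up for illustration, and note that `groupby` returns the countries in sorted order:

```python
import pandas as pd

# Hypothetical cumulative counts for two made-up countries
toy = pd.DataFrame({
    'Country': ['A', 'A', 'A', 'B', 'B', 'B'],
    'Confirmed': [1, 4, 6, 2, 3, 10],
})

# For each country: take the daily differences, then keep the largest one
max_rates = toy.groupby('Country')['Confirmed'].apply(lambda s: s.diff().max())

# Turn the result into a two-column dataframe, similar in shape to df_MIR
toy_mir = max_rates.rename('Max Infection Rate').reset_index()
print(toy_mir)
```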

Task 3.3:

let's create a new Dataframe

In [25]:
df_MIR = pd.DataFrame()
df_MIR['Country'] = countries
df_MIR['Max Infection Rate'] = max_infection_rates
df_MIR.head()
Out[25]:
Country Max Infection Rate
0 Afghanistan 3243.0
1 Albania 4789.0
2 Algeria 2521.0
3 Andorra 2313.0
4 Angola 5035.0

Let's plot the bar chart: maximum infection rate of each country

In [33]:
px.bar(df_MIR, x = 'Country' ,  y='Max Infection Rate', color = 'Country', title = 'Global Maximum Infection Rate')

Task 4: Let's see how the national lockdown impacted COVID-19 transmission in Italy

COVID19 pandemic lockdown in Italy

On 9 March 2020, the government of Italy under Prime Minister Giuseppe Conte imposed a national quarantine, restricting the movement of the population except for necessity, work, and health circumstances, in response to the growing pandemic of COVID-19 in the country.

In [34]:
italy_lockdown_start_date = '2020-03-09'
italy_lockdown_a_month_later = '2020-04-09'
In [35]:
df.head()
Out[35]:
Date Country Confirmed Recovered Deaths
0 2020-01-22 Afghanistan 0 0 0
1 2020-01-23 Afghanistan 0 0 0
2 2020-01-24 Afghanistan 0 0 0
3 2020-01-25 Afghanistan 0 0 0
4 2020-01-26 Afghanistan 0 0 0

let's get data related to italy

In [37]:
df_italy = df[df.Country == 'Italy']

let's check the dataframe

In [38]:
df_italy.head()
Out[38]:
Date Country Confirmed Recovered Deaths
70176 2020-01-22 Italy 0 0 0
70177 2020-01-23 Italy 0 0 0
70178 2020-01-24 Italy 0 0 0
70179 2020-01-25 Italy 0 0 0
70180 2020-01-26 Italy 0 0 0

let's calculate the infection rate in Italy

In [40]:
df_italy = df_italy.copy()  # explicit copy, so the assignment below does not trigger SettingWithCopyWarning
df_italy['Infection Rate'] = df_italy.Confirmed.diff()
df_italy.head()
Out[40]:
Date Country Confirmed Recovered Deaths Infection Rate
70176 2020-01-22 Italy 0 0 0 NaN
70177 2020-01-23 Italy 0 0 0 0.0
70178 2020-01-24 Italy 0 0 0 0.0
70179 2020-01-25 Italy 0 0 0 0.0
70180 2020-01-26 Italy 0 0 0 0.0

ok! now let's do the visualization

In [51]:
fig = px.line(df_italy, x='Date', y='Infection Rate', title='Before and After Lockdown in Italy')
# vertical red line marking the start of the lockdown
fig.add_shape(
    dict(
        type='line',
        x0=italy_lockdown_start_date,
        y0=0,
        x1=italy_lockdown_start_date,
        y1=df_italy['Infection Rate'].max(),
        line=dict(color='red', width=2)
    )
)
# label placed at the top of the line
fig.add_annotation(
    dict(
        x=italy_lockdown_start_date,
        y=df_italy['Infection Rate'].max(),
        text='starting date of the lockdown'
    )
)
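
The chart marks the lockdown start visually. To put a number on "before" versus "after", one option is to slice the series at the lockdown dates and compare averages. The sketch below uses a made-up `toy` frame and a hypothetical short window; with the real data you would presumably convert the column with `pd.to_datetime(df_italy['Date'])` and use `italy_lockdown_start_date` and `italy_lockdown_a_month_later` as the bounds:

```python
import pandas as pd

# Hypothetical daily infection rates around a lockdown date
toy = pd.DataFrame({
    'Date': pd.date_range('2020-03-05', periods=8, freq='D'),
    'Infection Rate': [100.0, 200.0, 300.0, 400.0, 500.0, 450.0, 400.0, 350.0],
})

lockdown_start = pd.Timestamp('2020-03-09')
window_end = pd.Timestamp('2020-03-12')  # hypothetical end of the comparison window

# Rows strictly before the lockdown, and rows from its start to the window end
before = toy[toy['Date'] < lockdown_start]
after = toy[(toy['Date'] >= lockdown_start) & (toy['Date'] <= window_end)]

mean_before = before['Infection Rate'].mean()
mean_after = after['Infection Rate'].mean()
print(mean_before, mean_after)
```

A higher average right after the start date would not by itself mean the lockdown failed, since reported cases lag behind infections.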

Task 5: Let's see how the national lockdown impacted COVID-19 deaths in Italy

In [49]:
df_italy.head()
Out[49]:
Date Country Confirmed Recovered Deaths Infection Rate
70176 2020-01-22 Italy 0 0 0 NaN
70177 2020-01-23 Italy 0 0 0 0.0
70178 2020-01-24 Italy 0 0 0 0.0
70179 2020-01-25 Italy 0 0 0 0.0
70180 2020-01-26 Italy 0 0 0 0.0

let's calculate the number of new deaths day by day

In [52]:
df_italy = df_italy.copy()  # make sure df_italy is an independent copy before adding a column
df_italy['Deaths Rate'] = df_italy.Deaths.diff()

let's check the dataframe again

In [53]:
df_italy.head()
Out[53]:
Date Country Confirmed Recovered Deaths Infection Rate Deaths Rate
70176 2020-01-22 Italy 0 0 0 NaN NaN
70177 2020-01-23 Italy 0 0 0 0.0 0.0
70178 2020-01-24 Italy 0 0 0 0.0 0.0
70179 2020-01-25 Italy 0 0 0 0.0 0.0
70180 2020-01-26 Italy 0 0 0 0.0 0.0

now let's plot a line chart to compare the impact of the national lockdown on the spread of the virus and on the number of daily deaths

In [55]:
fig= px.line(df_italy, x='Date', y=['Infection Rate', 'Deaths Rate'])
fig.show()

COVID19 pandemic lockdown in Germany

Lockdown was started in Freiburg, Baden-Württemberg and Bavaria on 20 March 2020. Three days later, it was expanded to the whole of Germany.

In [56]:
Germany_lockdown_start_date = '2020-03-23' 
Germany_lockdown_a_month_later = '2020-04-23'

let's select the data related to Germany

In [57]:
df_germany = df[df.Country == 'Germany']

let's check the dataframe

In [58]:
df_germany.head()
Out[58]:
Date Country Confirmed Recovered Deaths
54672 2020-01-22 Germany 0 0 0
54673 2020-01-23 Germany 0 0 0
54674 2020-01-24 Germany 0 0 0
54675 2020-01-25 Germany 0 0 0
54676 2020-01-26 Germany 0 0 0

selecting the needed column

In [ ]:
 

let's check it again

In [60]:
df_germany.head()
Out[60]:
Date Country Confirmed Recovered Deaths Infection Rate Deaths Rate
54672 2020-01-22 Germany 0 0 0 NaN NaN
54673 2020-01-23 Germany 0 0 0 0.0 0.0
54674 2020-01-24 Germany 0 0 0 0.0 0.0
54675 2020-01-25 Germany 0 0 0 0.0 0.0
54676 2020-01-26 Germany 0 0 0 0.0 0.0

let's calculate the infection rate in Germany
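
The cell for this step is empty in this export, although the output above already shows 'Infection Rate' and 'Deaths Rate' columns. A hedged sketch of what the step might look like, mirroring the Italy cells. The `toy_germany` frame and its values are made up and only stand in for `df_germany`:

```python
import pandas as pd

# Made-up stand-in for df[df.Country == 'Germany']
toy_germany = pd.DataFrame({
    'Date': ['2020-01-22', '2020-01-23', '2020-01-24', '2020-01-25'],
    'Country': ['Germany'] * 4,
    'Confirmed': [0, 3, 10, 25],
    'Recovered': [0, 0, 0, 0],
    'Deaths': [0, 0, 1, 4],
})

# Work on an explicit copy so that adding columns does not raise SettingWithCopyWarning
toy_germany = toy_germany.copy()

# Day-over-day change of the cumulative counts
toy_germany['Infection Rate'] = toy_germany['Confirmed'].diff()
toy_germany['Deaths Rate'] = toy_germany['Deaths'].diff()

print(toy_germany.head())
```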

In [ ]:
 

let's check the dataframe

In [ ]:
 

now let's plot the line chart

In [ ]:
 

let's do some scaling and plot the line chart!
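
Scaling in this step means dividing each series by its own maximum, so that daily cases and daily deaths both peak at 1 and can share one axis even though their raw magnitudes differ a lot. A minimal sketch on made-up values (the `toy` frame is an assumption for illustration):

```python
import pandas as pd

# Hypothetical daily rates; the first row is NaN, as produced by diff()
toy = pd.DataFrame({
    'Infection Rate': [float('nan'), 50.0, 200.0, 100.0],
    'Deaths Rate': [float('nan'), 1.0, 4.0, 8.0],
})

# toy.max() gives one maximum per column; the division is applied column by column
scaled = toy / toy.max()

print(scaled)
```

After scaling, the curves show the shape and timing of each series, not its absolute size.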

In [62]:
df_germany = df_germany.copy()  # independent copy, so the assignments below do not trigger SettingWithCopyWarning
# scale each series by its own maximum so both peak at 1
df_germany['Infection Rate'] = df_germany['Infection Rate']/df_germany['Infection Rate'].max()
df_germany['Deaths Rate'] = df_germany['Deaths Rate']/df_germany['Deaths Rate'].max()

In [72]:
fig = px.line(df_germany, x='Date', y=['Infection Rate', 'Deaths Rate'])
# vertical black line marking the start of the lockdown
fig.add_shape(
    dict(
        type='line',
        x0=Germany_lockdown_start_date,
        y0=0,
        x1=Germany_lockdown_start_date,
        y1=df_germany['Infection Rate'].max(),
        line=dict(color='black', width=2)
    )
)
fig.add_annotation(
    dict(
        x=Germany_lockdown_start_date,
        y=df_germany['Infection Rate'].max(),
        text='starting date of the lockdown'
    )
)
# vertical yellow line one month after the lockdown started
fig.add_shape(
    dict(
        type='line',
        x0=Germany_lockdown_a_month_later,
        y0=0,
        x1=Germany_lockdown_a_month_later,
        y1=df_germany['Infection Rate'].max(),
        line=dict(color='yellow', width=2)
    )
)
fig.add_annotation(
    dict(
        x=Germany_lockdown_a_month_later,
        y=0,
        text='a month later'
    )
)